Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quadtree region class - QuadtreeGrid2D #115

Merged
merged 56 commits into from
Jan 11, 2022

Conversation

khawajasim
Copy link
Contributor

QuadtreeGrid2D region class added along with some other helping functions.

I still need to make "_bin_catalog_spatial_counts(....)" work. For that I need to get idx_map, xs, ys and mask. Any suggestions for this?

@khawajasim khawajasim changed the title Quadtree master Quadtree region class - QuadtreeGrid2D Apr 19, 2021
@wsavran
Copy link
Collaborator

wsavran commented Apr 19, 2021

we need to replace that function with something that isn't specifically designed for the cartesian region. ultimately we want to provide a binwise_counts function on the region class. so i would add a function to the quadtree class called get_binwise_counts() that accepts a list of lons and lats and returns in the counts corresponding to the polygons in the quadtree class.

also, it looks like the build is failing because we haven't specified mercantile in the requirements.txt and requirements.yaml files in the top-level directory.

can you also explain a bit how this package called area works to calculate the area given lat/lon points? maybe this would be good to put in some documentation about this class.

@khawajasim
Copy link
Contributor Author

Thank you for your comments.

I have a question. I understand that I need to write two different functions that say, e.g. "get_spatial_counts" and "get_spatio_magnitude_counts". However, I am not entirely sure where to put these functions? Should it be a part of Catalog class or QuadtreeGrid2D class? I guess in current implementation a wrapper function is called from catalog class. What is your opinion?

@wsavran
Copy link
Collaborator

wsavran commented Apr 20, 2021

good question. now that im thinking about it more, i think the best would be to modify the function in the catalog class to work with the region object directly. right now it calls this _bin_catalog_spatial_counts function, instead of working directly with the region object. once we implement this behavior, we can just ditch these _bin_catalog_spatial_counts functions.

the general procedure of the function would be to use the get_index_of() function of the region class to get the indexes and then increment the counts of a numpy.array(). we can then use the bin1d_vec() function to compute the magnitude index (because these are regularly spaced). basically, we just need to take the function that exists and make it more general to work with any region object that implements the get_index_of() function.

@khawajasim
Copy link
Contributor Author

I have identifed an issue with bin1d_vec() function. I am not sure whether this is actually an issue or not. it takes in mbins and magnitude and provides the index of the bin, where magnitude lies.

from csep.utils.calc import bin1d_vec
mbins = numpy.arange(5.95, 9, 0.1) #This gives bins from 5.95 to 8.95
idx = bin1d_vec(mm, mbins, tol=0.00001, right_continuous=True)

so it returns the magnitude bin of mm.
if mm is actually lying inside a mbins, it would return its bin.

idx = bin1d_vec(6, mbins, tol=0.00001, right_continuous=True) #This would give 0: Which is fine.

But if, I give some magnitude that is below the mbins, e.g. mm=4 or 5. It would start negative counting beyond 5.95.

idx = bin1d_vec(5, mbins, tol=0.00001, right_continuous=True)
this gives -10

idx = bin1d_vec(4, mbins, tol=0.00001, right_continuous=True)
this gives -20

Ideally, it should be returning zero. Is it something you desired? or is it a bug?

@wsavran
Copy link
Collaborator

wsavran commented Apr 21, 2021

this expected behavior is that if the value falls outside the range it returns -1. the downside is that this is a valid index in python, and some issues arised from returning None there. the fact its returning -10 is not expected, but practically should not cause any issues it should return -1. this only happens with the right_continuous flag.

this issue only affects scalarr values, if you were to test with an array the -1 should be returned as expected. ill submit a pr to fix this in the next bit.

@khawajasim
Copy link
Contributor Author

This may cause problem for the following line. Because -1 or -10 are valid indices. So we may need additional check, whether catalog is filtered for the minimum magnitude or not.

numpy.add.at(out, idx, 1)

@wsavran
Copy link
Collaborator

wsavran commented Apr 21, 2021

yes, you should check whether the bins values are correct. you will notice both of the spatial counts functions have some sort of check inside them. these checks also account for potentially masked cells. see the _bin_catalog_spatial_counts function.

@wsavran
Copy link
Collaborator

wsavran commented Apr 21, 2021

we should probably add a check to the magitude_counts function in the catalog class though. i think that is maybe where you found that line of code.

@khawajasim
Copy link
Contributor Author

I added the check to spatial_count and spatial_magnitude_count functions. So now it should work alright.

I wanted to know, how do you treat the right corner event? I mean, the earthquakes occurring at lon of 180 or lat of 90?

@khawajasim
Copy link
Contributor Author

Now, I have completed integration of quadtree grid into pyCSEP. It is a basic working version for global testing region.
Please have a look at the code, and let me know if there are any suggetions.

Next steps are:
-Improve the implementation of some functions
-Integrate California quadtree grid

@wsavran
Copy link
Collaborator

wsavran commented Apr 21, 2021

great, lets go over it on the call tomorrow! can you also make sure to add mercantile (and any other extra requirements) to the requirements.txt and requirements.yaml files? before ill be able to merge, the build will need to be successful.

@wsavran
Copy link
Collaborator

wsavran commented Apr 21, 2021

I added the check to spatial_count and spatial_magnitude_count functions. So now it should work alright.
great!

I wanted to know, how do you treat the right corner event? I mean, the earthquakes occurring at lon of 180 or lat of 90?

the point at 180 should map to the bin at -180, but we should probably make an additional check, maybe in the constructor of the catalog class. this would enforce a convention about how lat/lon are reported by a catalog instead of trying to assume how to handle these points during binning.

i'm not sure how exact we want to handle a lat of 90, because technically that is a point. conceptually, i think if an earthquake were to occur directly at lat = 90, we would want to place it in the [89.9, 90.0) bin.

what do you think?

@khawajasim
Copy link
Contributor Author

I personally would like to go with the first suggestion. The point on 180 should belong to -180, both are same in actual. I would prefer to make this a constructor in Catalog class. May be change 180 to -180. What would you say?

@khawajasim
Copy link
Contributor Author

I added requirements of mercantile and area packages. But its still not resolving the issue of building package.
I am not sure why.

@wsavran
Copy link
Collaborator

wsavran commented Apr 26, 2021

it seems to be breaking during the cartopy build. yoou can look at the error message from the failing build by going here and looking through the terminal output from the build. i took a look and it seems like a cartopy issue, but take a look and let me know what you think.

also, it looks like the area package isn't available on conda-forge. either we need to make a feedstock for this package (which might be more work than we want at the moment; although id be happy to help you set this up if you'd like to maintain the feedstock) or we should implement this functionality into the package. it only seems like we are using it to compute the area, which could probably be done with the pyproj function that is already implemented. i think the best way forward will be to implement an area function in regions module.

@khawajasim
Copy link
Contributor Author

khawajasim commented Apr 26, 2021

I have a question. Would that be ethical and a good idea to directly copy the implementation of area from package to our implementation? If that is alright, then I can simply do that. Or if its a problem, then I would think of something else. The implementation is available here:

https://github.com/scisco/area

@wsavran
Copy link
Collaborator

wsavran commented Apr 26, 2021

that's a good question. from a legal standpoint, it depends on the license of the software package. i think that bsd-2 allows us to do this, see the license here. id recommend we write our own area calculation and associated unit-tests. im not normally about re-inventing the wheel, but this is a standad and easy enough to implement, we should probably do it.

looking at their code, it seems like something we could easily implement as well. i tagged you in a pull request from @hanbao-ucla. i think we should implement a get_area function on the region class. i think you already wrote this, but left it as a stand-alone function in the regions module. this function would return a numpy array where each element contains the area of the corresponding polygon.

what do you think?

@hanbao-ucla
Copy link
Contributor

Yeah that would be neat, as sometimes we need both 'rate [eg., eq/day/cell or eq/year/Earth]' and 'rate_density [eg. eq/year/km^2]'

@khawajasim
Copy link
Contributor Author

Hi @wsavran
I feel that the code build is failing due to following erros.
Can you guide me with this error?

from csep.core.regions import CartesianGrid2D, Polygon, create_space_magnitude_region
E ImportError: cannot import name 'CartesianGrid2D' from 'csep.core.regions' (/home/runner/work/pycsep/pycsep/csep/core/regions.py)

from csep.core.regions import QuadtreeGrid2D
E ImportError: cannot import name 'QuadtreeGrid2D' from 'csep.core.regions' (/home/runner/work/pycsep/pycsep/csep/core/regions.py)

@khawajasim
Copy link
Contributor Author

I wrote quadtree forecast reader in csep.utils.reader. Thus I needed to call QuadtreeGrid2D from csep.core.regions in readers, which caused circular imports. So I tried to rectify this by calling QuadtreeGrid2D inside the reader function. For now, I did this, as we agreed, but lets discuss sometime to improve structure of forecast reading. As the forecast handling remains same in general. We just have to call a different region,
region = QuadtreeGrid2D.fromquadkeys(qk), instead of
region = CartesianGrid2D([Polygon(bbox) for bbox in bboxes], dh, mask=poly_mask).

The forecast handling remains same overall. I believe forecast reading can also be done in the previous same function that you wrote, by introducing an if/else statement.

@khawajasim
Copy link
Contributor Author

As for documentation, I wrote documentation in docs/concepts/forecasts.rst and docs/concepts/regions.rst.

I can not build them on my PC. I am not sure whether I was doing it the correct way or not. But I am sure, if you accept the changes in the docs and run "make html", it should work.

@wsavran
Copy link
Collaborator

wsavran commented Jul 12, 2021

hi @khawajasim, i got my github 2fa issues sorted and im back online. ill test out the build tonight and we can chat about it on the call tomorrow. thanks for spending some time to write the documentation.

@wsavran wsavran changed the base branch from master to v0.6.0-rc January 11, 2022 23:11
@wsavran
Copy link
Collaborator

wsavran commented Jan 11, 2022

after more than enough time, im happy to say that im merging this pull request into the v0.6.0 release candidate branch

@wsavran wsavran merged commit 378ed35 into SCECcode:v0.6.0-rc Jan 11, 2022
@wsavran wsavran mentioned this pull request Jan 11, 2022
wsavran added a commit that referenced this pull request Feb 4, 2022
* Quadtree region class - QuadtreeGrid2D (#115)
* Added function to read QuadtreeGrid forecast
* added spatial_count and spatio_mag_count functions
* poission_evaluations using from QuadtreeGrid2D
* unit tests for quadtree-grid
* California quadtreegrid at L=12 added
* california quadtreegrid loading function
* setup.py modified-added mercantile
* forecast.plot() for Single-res Quadtree
* clean up unnecessary comments from code
* Update python-app.yml (#170)
* added documentation for roc and i1 score (#169)
* Fixed dh issue by modifying get_bbox() (#175)

Co-authored-by: khawajasim <asimkhawaja786@gmail.com>
Co-authored-by: William Savran <wsavran@usc.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

To perform binning on observed catalog for variable sized spatial grids.
4 participants